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Granger causality (GC) is a powerful metliod for causal inference for time series. In general, 
the GC value is computed using discrete time series sampled from continuous-time 
processes with a certain sampling interval length r, i.e., the GC value is a function of 
T. Using the GC analysis for the topology extraction of the simplest integrate-and-fire 
neuronal network of two neurons, we discuss behaviors of the GC value as a function of 
T, which exhibits (i) oscillations, often vanishing at certain finite sampling interval lengths, 
(ii) the GC vanishes linearly as one uses finer and finer sampling. We show that these 
sampling effects can occur in both linear and non-linear dynamics: the GC value may 
vanish in the presence of true causal influence or become non-zero in the absence of 
causal influence. Without properly taking this issue into account, GC analysis may produce 
unreliable conclusions about causal influence when applied to empirical data. These 
sampling artifacts on the GC value greatly complicate the reliability of causal inference 
using the GC analysis, in general, and the validity of topology reconstruction for networks, 
in particular. We use idealized linear models to illustrate possible mechanisms underlying 
these phenomena and to gain insight into the general spectral structures that give rise 
to these sampling effects. Finally, we present an approach to circumvent these sampling 
artifacts to obtain reliable GC values. 
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1. INTRODUCTION 

There have been great advances in experimental observational 
techniques in neuroscience, from single-unit, multi-unit, local 
field potential (LFP), to non-invasive electroencephalogra- 
phy (EEG), magnetoencephalography (MEG), and functional 
Magnetic Resonance Imaging (fMRI). These experimental 
tools have expanded the examination of physiological cor- 
relations of neuronal responses to stimuli to causal effects 
of how different parts of the nervous system influence one 
another. One of the fundamental questions is how one 
can infer causal relations between neurons or between neu- 
ronal populations through measured time series, which rep- 
resent neural activities of different neurons or neuronal 
populations. 

There is a long history of studying causal relations between 
time series. Granger causality (GC) has been developed to study 
how one time series is influencing the other. Recently GC 
has been applied to extract causal or structural information 
from time series (Strogatz, 2001; Boccaletti et al., 2006; Bressler 
and Seth, 2011). It has been applied to analyze times series 
obtained through invasive techniques, such as single-unit, multi- 
unit (Passaro et al, 2008), LFP (Brovelli et al, 2004; Bressler 
et al, 2007) recordings as well as non-invasive techniques, such 
as EEG (Astolfi et al, 2006), MEG (Gow et al, 2008), and fMRI 
(Roebroeck et al., 2005; Deshpande et al., 2008; Hamilton et al.. 



2010). The GC theory aims to analyze causal influence of one time 
series Xt on the other Yt by examining whether the prediction 
of Yt can be improved upon the incorporation of information 
of Xt (Wiener, 1956; Granger, 1969; Geweke, 1982). Although 
GC has proven to be a powerful framework to study directed 
causal connectivity within the brain, there are many challenges 
in its application to acquire reliable results. Usually the applica- 
tion of GC requires the time series to be linear. When time series 
are linear and Gaussian, GC is equivalent to the transfer entropy 
(Barnett et al., 2009; Bressler and Seth, 2011). For non-linear, 
non-Gaussian time series, it remains an active research field to 
study the applicability of the GC. Note that, the notion of GC is 
statistical rather than structural, i.e., through statistical features of 
responses, it identifies directed statistical functional connectivity, 
which may be different from physical causal interactions in the 
systems. 

As a statistical method, one of the important issues in the GC 
analysis is how to sample the dynamics in order to obtain suit- 
able time series for a reliable result from GC analysis. It has been 
pointed out that GC is not invariant to the sampling interval 
(McCrorie and Chambers, 2006). Spurious causality may arise 
if sampling is not sufficiently fine to capture the evolution of 
dynamical variables (McCrorie and Chambers, 2006). In its appli- 
cation to the analysis of fMRI data, one is confronted with the 
unavoidable issue of low-pass filtering and down-sampling in 
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processing the BOLD signal (Roebroeck et al., 2005; Danks and 
Plis, 2013). 

However, the directed statistical functional connectivity as 
revealed in the GC analysis is intimately related to the struc- 
tural connectivity of networks. Previously, we have applied the 
GC analysis to reconstruct network structural connectivity and 
our results demonstrate that the GC analysis can be successfully 
used to reconstruct the coupling structures of the integrate- 
and-fire (I&F) networks (Zhou et al, 2013b, 2014). We note in 
passing that for smoothly coupled oscUlator networks (with or 
without noisy inputs), techniques including chaotic synchroniza- 
tion or phase dynamics have been used to uncover the struc- 
tural connectivity via random phase reset or their responses to 
perturbations (Yu et al, 2006; Timme, 2007; Napoletani and 
Sauer, 2008; Smirnov and Bezruchko, 2009; Ren et al, 2010; 
Levnajic and Pikovsky, 2011), whereas, for some simple non- 
smooth network dynamics, e.g., (5-pulse coupled, current-based 
integrate-and-fire systems, their topologies can be reconstructed 
by designing a specific set of external inputs (Bussel et al., 201 1). 
However, in general, it remains a challenging task to directly 
uncover the network topology for a non-smooth, non-linear sys- 
tem, such as the conductance-based I&F networks, given the 
condition that only the dynamical activity at individual nodes 
can be measured. The GC analysis in our network construction 
uses time series measured from the network dynamics response, 
such as the membrane potential, to a natural input. In this 
approach, we do not design specific inputs to probe the spe- 
cific response of the system in order to uncover the network 
topology. 

Because most dynamical quantities are continuous in time, GC 
values are computed using the discrete time series sampled from 
these continuous time processes. In this work, we analyze sam- 
pling artifacts in the GC analysis. In particular, we examine the 
reliability of the I&F network topology reconstructed using the 
GC analysis, by studying behaviors of the GC value as a function 
of sampling interval length. We will term this function as the GC 
sampling structure. As will be shown below, surprisingly, there are 
oscillations in the GC sampling structure and the GC may even 
vanish on a set of finite sampling interval lengths even if there are 
physical couplings between neurons. The phenomenon of vanish- 
ing GC on a set of sampling interval lengths obviously complicates 
the interpretation of causal inference and raises the issue of the 
reliability of network topology reconstruction using the GC anal- 
ysis. In addition, if sampling interval is not sufficiently small, as 
mentioned above, GC may not vanish even when there is actually 
no causal influence (McCrorie and Chambers, 2006). Therefore, 
in general, without properly taking into account the sampling 
effect, different conclusions can be drawn about causality, due 
to the difference in sampling, when the GC analysis is applied 
to experimental observational data. One may presume that if we 
used discrete time series sampled using ever finer interval lengths 
from a continuous process in time, a more reliable GC value for 
causal inference could be obtained since the sampled time series 
in such case is more and more close to the continuous dynamical 
process. However, as we will show below, the GC value actually 
approaches zero linearly in proportion to the sampling interval 
length. Therefore, in the GC analysis for the network topology 



reconstruction, one of the main theoretical challenges is what is 
an appropriate sampling interval in order to obtain reliable GC 
values. 

To answer the question of how to determine the sampling 
interval, we first characterize various features associated with the 
GC sampling structures arising from the network topology recon- 
struction. We will use idealized linear models to illustrate possible 
mechanisms underlying these structures and to gain insight into 
reliability of the GC analysis applied to linear and non-linear 
systems. Finally, we will present general strategies to circumvent 
these complications in the GC analysis. It is important to point 
out that these sampling issues arise from how discrete time series 
for the GC analysis are sampled from continuous time processes 
and they need to be addressed regardless of whether one uses 
a parametric method or a non-parametric method for the GC 
evaluation (Geweke, 1982; Dhamala et al, 2008). 

This article is organized as follows. In the section of Methods, 
we briefly introduce the I&F network model, the GC analysis 
and its application in the network reconstruction. In the section 
of Results, first, we numerically study the GC sampling struc- 
tures for the I&F networks. We observe that GC values oscillate 
with respect to the sampling interval length r and approach zero 
linearly in proportion to r as t tends to zero. These two phe- 
nomena may give rise to complications for reliable GC analysis 
of the I&F network, in addition to the possible spurious GC in 
the absence of causal influence. Second, we study mechanisms 
of oscillations in GC sampling structures through linear mod- 
els. Third, we study the limiting behavior of GC as r tends to 
zero and discuss the underlying mechanisms. In the section of 
Discussion and Conclusion, we describe our approach of elimi- 
nating the sampling artifacts in the application of GC and present 
our conclusions. In Appendices A-C of Supplementary Material, 
we present related mathematical details. 

2. MATERIALS AND METHODS 
2.1. I&F NETWORK MODEL 

As we mentioned in the Introduction, we have applied GC anal- 
ysis to reconstruct the topological connectivity of I&F networks. 
Experimentally, it has been shown that models of I&F neurons 
are able to capture rather well the subthreshold dynamics of a real 
neuron in terms of linear response properties as well as statistical 
firing properties of a real neuron (Carandini et al, 1996; Rauch 
et al., 2003; Burkitt, 2006). The models of I&F neurons have been 
extensively used in simulations of large-scale neuronal networks 
in the modeling of cortical phenomena in the brain (Somers 
et al, 1995; Troyer et al, 1998; McLaughlin et al., 2000; Tao et al, 
2004; Cai et al, 2005; Rangan et al, 2005; Zhou et al, 2013a). In 
this work, we study the network of the conductance-based, I&F 
neurons of excitatory type and its dynamics is governed by the 
following equations: 

f dV, 

— ^ = -Gl(V, - 6l) - G,(V, - 6e) , 
at 

jji, k 

(1) 
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where V, and G, are the membrane potential and conductance 
for the rth neuron in the network, respectively, Gl is the leak 
conductance, 6l is the resting potential, eg is the reversal poten- 
tial, a is the conductance time constant. When V, is between the 
threshold Vth and the resting potential 6li the dynamics of a neu- 
ron is described by Equation (1). However, when the membrane 
potential of a neuron reaches the threshold Vth> it is reset to the 
resting potential 6l for a certain period r^ef (refractory period). 
An action potential of a real neuron is modeled here by the event 
of the threshold crossing and the voltage resetting. The event 
of this threshold-reset dynamics is referred to as a firing event 
(spike) of a neuron, s,-, is the connection strength from the presy- 
naptic neuron to the postsynaptic neuron i. For simplicity, we 
study a homogeneously coupled network, i.e., s,,- = sAj„ where s 
is the connection strength, A = (Ay,) is the adjacency matrix of 
the network. Tjj; is the time of the fcth spike of the jth neuron, 
whereas T^^ is the arrival time of the /th spike of the input to 
the ith neuron. The input is a Poisson process with rate v and 
strength k. 

In our work, we use dimensionless unit for membrane poten- 
tials, in particular, Vth = 1, = 0, eg = y. These correspond 
to unsealed physiological values, Vth = — 55mV, 6l = — 70mV, 
and 6e = 0 mV (McLaughlin et al, 2000; Cai et al, 2005; Rangan 
et al., 2005; Zhou et al., 2013a). Time constants retain its dimen- 
sion, for which we use ms for its unit. We set Tref = 2 ms and the 
conductance time constant cr = 2 ms. Conductance has the unit 
ms^^. We have Gl = 0.05 ms^^, which corresponds to the phys- 
iological membrane time constant 20 ms. The method we used 
to solve system (1) numerically is a fourth order Runge-Kutta 
method with spike-spike corrections, the details of which can be 
found in Rangan and Cai (2007). 

In addition to voltage signals, we also use time series con- 
structed from spike trains: 

+00 

sit) = J2s{t - r;), 

i = 1 

where, S{t — ) is the Dirac delta function, T'^ is the firing time of 
the ith spike. In order to avoid the infinity in the Dirac 5-function, 
we convolve these spikes with a kernel. The simplest kernel is 
the indicator function with width d. Then, a spike train can be 
represented as the following: 

S(f)= / ^3(r-rA/d(t - r)dr, (2) 

0 ■=! 

where, Idix) = 1 for \x\ < | and Idix) = 0 for \x\ > j. We use 
sufficiently small d so there are no overlaps between spikes. Then, 
we can obtain a time series of binary type by sampling signal (2) 
with a certain sampling interval length. 

In the following, we wUl perform the GC analysis on the time 
series of voltage or spike trains to reconstruct the I&F network 
topology, i.e., the adjacency matrix A = (Aj,) and discuss the 
sampling issues in this approach. 



2.2. BRIEF INTRODUCTION TO THE GC ANALYSIS 

The GC analysis is based on the idea that, if one can reduce the 
prediction error about a time series after incorporating the history 
of the second time series, then there must be a causal influence 
of the second time series on the first (Wiener, 1956; Granger, 
1969; Geweke, 1982). The GC analysis can be naturally formu- 
lated in a bivariate form. However, the causality analysis can also 
be naturally generalized to a multivariate form using conditional 
Granger causality (Geweke, 1984; Ding et al., 2006). In the follow- 
ing, for ease of later discussions, we briefly recapitulate properties 
of Granger causality in the bivariate setting. 

For two stationary time series X; and Yt, one can perform 
autoregression to obtain 

= T,fL 1 aijXt-j + €u, , , 

where eu and 77 if are autoregression residuals representing the 
prediction errors when we consider the history of each time 
series separately. Here, Ei = var(6it), Fi = var();if). The corre- 
sponding joint regression has the following form for times series 
X, and Y, 

[Yt =j:r=i'^2jXr-j + j:r=id2jY,-j + ri2t, 

where 62t and iiit are joint regression residuals representing the 
prediction errors after we consider a shared history for both time 
series. The covariance matrix for e2t and ii2t is denoted by 

LT2 Fz. 

where E2 = var(62f), r2 = var(?;2f), T2 = cov(62t, ?72t)- By the 
Gauss-Markov theorem (Scheffe, 1959), the series {en}, {€2t], 
{rjit}, and {J72f} are white noise time series. It is evident from 
Equations (3) and (4) that E2 ^ Si and F2 ^ Fi, that is, one can 
never obtain a worse prediction (i.e., greater residual variance) of 
one time series after incorporating information from the other 
time series. 

Following the idea of Granger causality, if E2 = Ei, i.e., the 
prediction error cannot be reduced by the joint regression, then 
there is no causal influence from Yt to Xf. However, if E2 < Ei, 
there is a causal influence from Yt to Xt. This causal influence is 
characterized by the Granger causality which is defined as 

F^^, = ln^. (5) 

^2 

Clearly, Fy^^ ^ 0 because E2 ^ Ei; Py-^^ = 0 when E2 = Ei. 
Similarly, Granger causality from {Xt] to {Yt\ is defined as 

F,^y = \rJ-^. (6) 
1 2 

The idea of instantaneous causality is to quantify the mutual 
instantaneous influence of both time series without any time 
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lag. If T2 = 0, i.e., 62f and ri2t are uncorrelated, then incorpo- 
ration of the new instantaneous information of Xt (e2t) cannot 
help to reduce the variance of ri2t and vice versa. Therefore, the 
instantaneous causality can be defined as 



•-x-y 



In 



£2^2 



(7) 



where 1 2 1 is the determinant of the matrix Y, . Note that, when 
T2 = 0, |E| = r2E2 - T| = r2i;2, therefore f^.y vanishes, i.e., 
no instantaneous causality. The total Granger causality between 
{Xt} and {Yt} can now be defined as 



In- 



which can easily be seen to be the sum of all the three causality 
terms Equations (5), (6), and (7): 



Py^x ~f~ y ~f~ Px ■ 



(8) 



Note that, in the GC analysis of our network topology recon- 
struction, we mainly focus on the values of Fy^^ and F^^y 
because they quantify the directional causal strengths. As can 
be seen below, Fy^^ and F^^y can be related to the directional 
connectivity of a network (Zhou et al., 2013b, 2014). 

For our later discussions, in addition to the time-domain 
description of GC, we also briefly summarize the description 
of GC in the frequency domain (Geweke, 1982; Ding et al, 
2006). For the bivariate time series Xt, Yt, the spectral matrix 

Sxxi^o) Sxyiw) 



S{cl)) is represented as S{co) 



SyAa>) Syyiai) 



where Sxx((w) 



is the auto-spectrum of the time series Xt, defined by Sxx(«) = 
'^'^^_^cov{Xt,Xt-„)e^'"'^, Syyici)) is the auto-spectrum of 
the time series Yt, Sxyico) is the cross-spectrum of Xt and 
Y„ defined by S^yiw) = Et=-oc cov(X(, Yt-„)e-'"'^. Note that 



Syxico) is the complex conjugate of Sxyico). The covariances 
of residuals possess the following spectral representations [see 
(Geweke, 1982; Ding et al, 2006) for details]: 



InS, 



InTi 



InlSI 



^ j lnSxx((^)dw, 

— JT 
JT 

1 f 

— / lnSvv(tt<)dii), 
2jr J 



TV 

27t J 



ln|S(w)|dw, 



•-x.y 



Syy{M)Sxxi0}) 
|S(«)| 



dw. 



(9) 



(10) 



(11) 



(12) 



where |S(tt')| = Sxxi'»)Syy{a)) 
of the spectral matrix S(tt>). 



■ Sxy(a>)Syx{co) is the determinant 



2.3. EXTRACTION OF THE NETWORK TOPOLOGY 

We now turn to the reconstruction of the network topology using 
the GC analysis. For a two-neuron network as shown in Figure lA 
in which neuron x is postsynaptic to neuron y whereas there is 
no synaptic input from neuron x to neuron y, the corresponding 
adjacency matrix of the network is shown in Figure IB. Applying 
the GC analysis on voltage time series with sampling interval 
length T = 0.5 ms, time length T = 10^ ms and regression order 



p = 20, we obtain Fy- 



8.3 X 10"* and Fx- 



: 0.1 X 10" 



(Figure IC). Note that, in our work, the Bayesian information 
criterion (Schwarz, 1978) is used to determine the regression 
order p. In our example in Figure 1, we have Fy^x ^ Fx-^y and 
the GC value from neuron y to neuron x is almost two orders 
of magnitude larger than the inverse direction. From this, one 
may conclude that GC values are linked to the adjacency matrix 
of the network, i.e., Ayx = 1 and Axy = 0. This is also consis- 
tent with the result by performing statistical significance test (see 
Appendix C in Supplementary Material for details) (Zhou et al., 
2013b, 2014). In the present work, we focus on the implication 
of sampling effects and on the assessment of the reliability of this 
reconstruction by the GC analysis. 

3. RESULTS 

3.1. EFFECTS OF SAMPLING 

As discussed in the Introduction, the GC analysis has emerged 
as a popular tool in detecting causal relations among times series 
of physiological data measured in the brain. Note that most of 
the dynamical quantities underlying these times series are con- 
tinuous in time, therefore, a natural and important issue arises 
about what is the proper way of constructing discrete time series 
from continuous quantities when applying the GC analysis. To 
address this issue, we examine the following two related ques- 
tions: one is whether it makes great differences when discrete time 
series sampled at different time intervals are used and the other 
is whether the GC analysis would become more reliable as the 
sampling interval length shrinks to zero. We will answer these 
two questions using our I&F network dynamics below. Due to 
its own importance, as mentioned previously, we will refer to the 
GC value as a function of sampling interval length r as the GC 
sampling structure. 

3. 1. 1. Oscillations in the GC sampling structure 

In the topology extraction for the network dynamics of two 
neurons (see Methods) with unidirectional interactions, whose 



B 




FIGURE 1 I (A) Schematic representation of a unidirectional two-neuron 
network with only neuron y synaptically connected to the postsynaptic 
neuron x. (B) The corresponding adjacency matrix A with Ayx = 1 for the 
network in (A). (C) The grayscale grid representation of the GC matrix F 
[Fjj = Fj^j) for the network. The gray scale bar indicates the value of GC. 
The parameters of the network are u = 1 ms"^ , X = 0.012, and s = 0.02. 
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topology is shown in Figure lA, we perform the GC analysis on 
both voltage and spike train signals of the neurons. In order to 
examine possible differences of GC values with different sampling 
intervals, we first perform a scan of interval lengths in sampling 
of the dynamical variables from the I&F network. Note that we 
evaluate the GC values with different sampling interval lengths 
based on the same original signal, i.e., the same total duration of 
the signal. 

The scanning result is summarized in Figure 2A, which dis- 
plays the GC value as a function of sampling interval length r for 
the I&F dynamics. From Figure 2A, for GC values obtained from 
both the voltage and spike train time series, we have the follow- 
ing Observations: (i) there are oscillations in the GC sampling 
structure, (ii) the GC value becomes almost zero periodically, 
and (iii) the GC value oscillates almost at a constant frequency 
(approximately 500 Hz in Figure 2A), (iv) the GC value decays 
to zero as r becomes sufficiently large. It is easy to appreci- 
ate Observation (iv) because for time series with a sufficiently 
large sampling interval length, much of the information is lost, 
resulting in unreliable causal detection, i.e., vanishing GC val- 
ues. These observations demonstrate that for I&F dynamics there 
are great effects of sampling interval length on the GC val- 
ues regardless of whether the voltage or spike train time series 
are used. 

In general, there are oscillations, with a narrow spectral band 
or a broad spectral band, in time series from the I&F networks. It 
is natural to use the frequency domain to examine these features 
of the time series. We ask whether there is a relation between the 
oscillations of time series which can be revealed by the spectra and 



the GC sampling structure. In Figure 2C, the peak frequencies 
of all spectra Sxx, \ Sxy \ and Syy are approximately 250 Hz which 
is half of the oscillation frequency 500 Hz in the GC sampling 
structure in Figure 2A. In the following, we will investigate this 
relation. 

From the results of Figure 2A, a question naturally arises: 
whether the oscillation phenomenon is general and is not con- 
fined to the time series from the non-linear dynamics of the 
I&F networks. In particular, we ask whether the oscillations 
exist for linear autoregressive models, for which the GC frame- 
work is established. In order to answer this question, we study a 
simple example of linear models. A representative second-order 
autoregressive model we study is 

X, = 0.9Xf_ 1 - 0.6Xf_2 - 0.37,_ i + 0.157,_2 + fr, , 
Yt = 0.7Yt-i-0AYt^2 + rit, 

where 6t and rjt are Gaussian white noise time series with vari- 
ances var(6() = 0.5, var(?jt) = 1 and covariance cov(6f, rjt) = 
0.2. Figure 2B displays the GC values computed using the time 
series generated by the model (13) with different "sampling inter- 
val length" r = k, i.e., we use every datum point after skipping 
k — I points in between. From Figure 2B, we can also observe 
the oscillatory feature in the GC sampling structure, which also 
exhibits the behavior of GC that almost periodically vanishes. 
Note that we have examined many other parameters for second- 
order autoregressive models and found that the oscillatory fea- 
tures are quite common, as wiU be addressed below. In Figure 2D, 
the peak frequencies of all spectra for this model are approxi- 
mately i which is again half of the oscillation frequency j in the 
GC sampling structure in Figure 2B. 

In summary, it can be seen that the oscillations of the GC val- 
ues with respect to sampling interval length is a common feature, 
not only for I&F networks but also for linear models. These fea- 
tures in the GC sampling structure have strong ramifications in 
the application of the GC analysis. Importantly, it is disconcert- 
ing to observe that the GC value almost become zero at some 
sampling interval lengths despite the fact that there are clearly 
physical connections in the network or the coupling in the lin- 
ear model between the two time series. We further note that, in 
Figure 2A, Fx^y from both voltage and spike train time series 
displays non-zero values for some ranges of r, in contrast to the 
physical couplings in the system for which there is no synap- 
tic connection from neuron x to y. As will be shown below, 
this spurious GC also arises for linear systems. Evidently, the 
phenomena of vanishing GC and spurious GC will greatly com- 
plicate the reliability of any conclusions about the causal influence 
and will make the validity of the network topology reconstruc- 
tion questionable. Later, we will return to the resolution of 
these issues. 

3. 1.2. The GC sampling structure as t->- 0 

As demonstrated through the oscillation phenomenon of GC 
above, clearly, we cannot choose sampling interval length arbi- 
trarily in the application of GC analysis. Then, an important issue 
arises: what should be the criteria of choosing a correct sampling 
interval length to obtain discrete time series in measurement for 




250 500 750 
f(Hz) 



0.2 0.3 0.4 0.5 
f 



FIGURE 2 I The GC sampling structure and corresponding spectra. The 

top row of panel is GC vs. sampling interval length; (A) Fx^y (red), Fy^x 
(cyan) obtained from voltage time series and Fx^y (red dash), Fy^x (cyan 
dash) obtained from spike train time series with sampling interval t. The 
time series are generated by the I&F network whose topology is shown in 
Figure 1A with parameters v = 1 ms"^ , k = 0.066, s = 0.02. (B) F^^y 
(red) and Fy^x (cyan) with sampling interval length k (see text) for the 
second order autoregressive model (13). (C) and (D): Corresponding 
spectra of the voltage time series for (A) and the autoregressive time 
series for (B), respectively: Sxx (cyan), Syy (red), \Sxy\ (black). 



Frontiers in Computational Neuroscience 



www.frontlersin.org 



July 2014 I Volume 8 | Article 75 | 5 



Zhou et al. 



Sampling artifacts on tlie Granger causality analysis 



reliable GC analysis. For a natural discrete-time dynamics, we can 
simply use their intrinsic intervals, e.g., for discrete time series 
generated by an autoregressive model, and we can then obtain 
reliable GC values for causal interpretations because the entire 
information is incorporated in the causal analysis. However, most 
physical quantities are continuous in time, one does not have 
particular intrinsic intervals for sampling. In order to obtain reli- 
able GC values, one possible scenario, similar to the discrete case, 
is that it is always better if one uses more finely sampled time 
series because apparently more information is incorporated for 
causality determination. To examine this scenario, we study the 
convergence property of the GC sampling structure as the sam- 
pling interval length r tends to 0. The corresponding numerical 
results are shown in Figure 3 for a two-neuron I&F network. 

From Figure 3, we observe that for both voltage and spike train 
time series, GC values approach 0 almost linearly as the sampling 
interval length r tends to 0. This phenomenon of vanishing GC 
values implies that we eventually fail to extract real causal rela- 
tion through GC analysis when sampling interval length vanishes 
regardless of whether or not there are interactions between two 
time series. In this situation, obviously, we cannot use the limit 
GC value as r ^ 0 to infer the causal interaction especially for 
the I&F networks. Here, we note that this limit effect of sam- 
pling is general for continuous-time processes and the underlying 
mechanism wiU be demonstrated later. 

In summary, the phenomena shown in Figures 2, 3 demon- 
strate that time series obtained by using different sampling inter- 
val lengths lead to great differences in GC values and a finer 
sampling does not result in more reliable GC values for causal 
interpretation. Therefore, in order to obtain meaningful inter- 
pretation through GC, how to choose a sampling interval length 
becomes an important issue in practice. Without properly taking 



this issue into account, GC analysis may produce unreliable or 
even opposite conclusions about causal influence when applied to 
empirical data sampled with different sampling interval lengths. 
In what follows, we first use linear regressive models and spectral 
methods to study theoretically possible mechanisms underlying 
the oscillation phenomenon. With these idealized models we can 
gain insight into possible origins of general oscillations in the GC 
sampling structure. 

3.2. MECHANISM OF OSCILLATIONS IN THE GC SAMPLING 

STRUCTURE 
3.2. 1. Spectra and the GC sampling structure 

We will invoke the frequency domain description of GC for 
our analysis of the GC sampling structure. First, we discuss 
the relation between the spectrum Sico) for the time series 
{Xq , Xi, X2, ■ ■ ■} whose sampling time interval is r , and spectrum 
S(*'(w) for time series {Xo,Xi^, X2k, ■ ■ ■}, which is a time series 
sampled at the interval length kr, k being a positive integer. Using 
the Wiener- Khinchin theorem, we can relate the spectra Sico) and 
S'^*'(c(j) of these two time series to their time correlations, thus 
enabling us to show that 

} = o 

The details of the proof can be found in Appendix A in 
Supplementary Material. Note that Equation (14) is also valid for 
cross-spectrum Sxyico) of time series Xt and Yt, which is defined 
by S^yica) = Et=-oo cov(Xf , y,_ Je" 

A direct implication of Equation (14) is that if S(&)) = C (C is 
a constant), then S'*^' (co) = C. That is, if the original time series is 
white, then the time series obtained by sampling with any inter- 
val length remains to be white. In addition, we have the following 
results about GC in the case in which one of the bivariate time 
series is white and the other has no correlation with it. If Yt, t 
is an integer, is a white noise series and cov(yf, Xf _ ,) = 0 for 
any positive integer i > 0, then Px%y = 0. In particular, when 
cov(yf, Xi^i) = 0 for any integer ^ 0, Fi%y = F^^y = 0. These 
facts conform with our intuition that there is no causal flow if 
there is no correlation between Xt and Yt and the Yt remains 
always white (see Appendix A in Supplementary Material for 
details). We will use these facts in our discussions below. 

We now turn to the question of what is the structure in time 
series that will give rise to oscillations in the GC sampling struc- 
ture. A possible origin of oscillations in the GC sampling structure 
may be a consequence of oscillations in the original time series. 
Our strategy of tackling this question is as follows: First, we note 
that the spectrum of the original time series characterizes the 
oscillation strength of the time series over different frequencies. 
We will construct simplified models in which we can focus on 
the relationship between the peak frequencies of the spectrum 
and oscillations in the GC sampling structure. We wiU construct 
different classes of models which allow us to directly analyze the 
GC sampling structure through the spectral representation (14). 
Using Equation (14), we can derive an expression of the spectral 



,x 10 




FIGURE 3 I The GC sampling structure as sampling interval length 
tends to zero. F^^y (red), Fy^x (cyan) obtained from voltage time series 
and Fx^y (red dash), Fy^x (cyan dash) obtained from spike train time series 
with sampling interval length r. Note that, by the asymptotic distribution 
theory of GC (Geweke, 1982), the estimator of a directed GC has a bias £, 
where p is the regression order and n is the length of the discrete time 
series. We have used the Bayesian information criterion (Schwarz, 1978) to 
determine the regression order p and have subtracted this type of biases in 
the figure (see Appendix C in Supplementary Material for details). The time 
series are generated by the I&F network whose topology is shown in 
Figure 1A with parameters v = 1 ms"^ , X = 0.0177, s = 0.02. 



Frontiers in Computational Neuroscience 



www.frontiersin.org 



July 2014 I Volume 8 | Article 75 | 6 



Zhou et al. 



Sampling artifacts on the Granger causality analysis 



matrix S'*^' with sampling interval k, S^^^ 



Ak) Jk) 

Ak) Ak) 
Jyx '->yy 



.andSW 



can be directly used to compute the total causality F^j, through 
Equation (12) as follows 



p(k) . 
x,y 



1 r 



In 1 



dft). 



(15) 



Because the total causality in Equation (8) involves three different 

terms and it is usually difficult to derive the analytical formula 
for each term, one way of tackling this issue is to consider each 
term separately. In particular, we construct a special model for 



which we have F- 



(k) 



F^^j = 0. Then, the directional GC FyUx 
ik) 



is identical to the total causality j, and we can therefore analyze 
the GC sampling structure Fj,J^x through the total causality in 
Equation (15). If the condition S^'(&))s|y (w) JS> S^xy{w)S^yJici)) 
holds, we can obtain the following simplified approximation for 
the directional GC f[ 



(k) 



w(k) 



1 

277 J-K 



2^ J-n si^'(a))S}^'(«) 



do). 



(16) 



(16) to analyze the relation between F'^yX^ and samphng interval 
length k. 

To illustrate clearly the sampling phenomena, we discuss two 
classes of a(L) and i)(L) in the following. 

Forthe first class, we fix S;cx = a(co)a*{co)ii + b(co)b*{co) = C, 
where C is constant and we always assume C ^ max {b(w)b* (a>)) 
so as to guarantee that Equation (16) is a good approximation. 
Under this condition, there are no oscillations in the time series 
Xf as discussed previously and the only spectrum that varies over 
k is the cross-spectrum Sxy. Then, we can find explicit asymp- 

(k) 

totic expressions and approximations of fy-t^ in order to study 
the relation between the peak frequency of Sxy{a>) and the oscilla- 
tions of F^x with respect to the sampling interval length k. The 
spectrum Sxy corresponds to the Fourier transform of the corre- 
lation function between Xt and Yf. In applications, it is common 
that a correlation function has damping and oscillation struc- 
tures. Therefore, we wiU study the foUovnng three cases in the first 
class: 



Case 1.1 b(L) = J^f^i e'^'''!^ 
Case 1.2 b(L) = ^ 
Case 1.3 b(L) = J2j=i 



\-oo ^— 

: 1 



In the following, we will use simplified models to reveal the ori- 
gin of oscillations in the GC sampling structure, thus, providing 
insight into an intuitive understanding of the possible origin of 
oscillations in the GC sampling structure for the I&F network as 
well as for other general situations. 

3.2.2. Idealized model I 

The first idealized linear model we have constructed is as follows 



X, =a{L)€t + b{L)tit, 



(17) 



Here, td, P and 0 are constants. We note that there are no oscilla- 
tions in b{L) in Case 1.1, and there are oscillations in biL) in Cases 
1.2 and 1.3 and with a further phase shift in Case 1.3. 

For the second class, we do not fix Sxxico) as a constant, i.e., the 
time series Xf is no longer white. Instead, we study some special 
combinations of a(L) and b{L). For this class, we cannot derive 
explicit analytic forms of FyXx- Therefore, we turn to GC values 
obtained numerically through Equations (14) and (15) to study 
the relation between the oscillations of F^x and the peak fre- 
quencies of Sxxico) and | Sxy{a>) | . We will study the following cases 
in this class: 



where [et] and {/jt} are independent white noise series with 
> 0" 



covariance matrix X 



0 1 



L is the lag operator satis- 



fying LXt = Xt-i. a{L) = l+aiL + a2L^ -\ , b{L) = b\L + 

b2L^ We can obtain the spectra of Xt, Yt, which are 

Sxx = a{a>)a*{co)fji, + b{co)b*(a)). 



1, 



i.e., Yf is a white noise time series, and 



^xy ■ 



Hco), 



where a(«) = 1 + E^=i and b{co) = E^=i 

which are the Fourier transforms of a(L) and b{L), respectively. 
By the result of the previous section (also see Appendix A in 



Supplementary Material for details), we have Fl 



(k) 
x-y 



Ak) 



: Ofor 



this model, therefore, the other directional GC is equal to the total 
causality, i.e., F^x = Fx^y- Then, we can use Equations (15) and 



Case 2.1 a(L) = J2j=o e-^'"cos{fiij)V, b{L) = "£1=1 e"^''-'^' 
Case 2.2 o(I) = Y,j'=o e"'''-'cos(/Sij)L'', b{L) = 



n 



Here, Xd, Pi, fii are constants. We note that in Cases 2.1 and 2.2 
there are oscillations in the time series in Xt with a non-osciUatory 
b{L) in Case 2.1 and with an oscillatory h{t) in Case 2.2. 

We now turn to the detailed discussion of the properties of 
f'^Xx in all the hsted cases. 

Case 1.1 We consider the case b{L) = J^J'^i e-^'^V which 
has no oscillations in b(L) and, therefore, the peak frequency of 
|5xy(ft))| is at 0. We will examine whether F'^yXx oscillates with 
respect to k. From Equation (16), Under Approximation \: C"^ 
b^{(o)l^{co)* , we can derive the following asymptotic result (see 
Appendix A in Supplementary Material for details) 



p(k) 



c ' 



(18) 
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for which, we can clearly see that GC decreases exponentially with 
respect to k. 

In Figure 4A, which displays spectrum \Sxy\, we can observe 
that the magnitude of Sxy concentrates near 0 which signifies 
that there are no oscillations induced by the coupling b{L). In 
Figure 4D, which displays the GC sampling structure for this 
case, clearly, there are no oscillations. It can also be seen that 
GC obtained through the asymptotic result, [Equation (2) in 
Supplementary Material], agrees very well with the numerically 
obtained GC for all k's and the asymptotic formula Equation (18) 
approximates GC rather well when k is large. Therefore, we can 
conclude that if there are no oscillations in the coupling b(L), as 
signified in the peak frequency of | Sxy \ near 0 and if Sxx and S, 



is located approximately 



are constant (i.e., both are white), then Fy. 
with respect to k . 



ik) 



Jyy 

does not oscillate 



Case 1.2 From Case 1.1 in Model I, no oscillations in b{L) 
implies no oscillations in the GC sampling structure. Next, we 
consider b{L) = e^^'''c.os{Pj)V to examine whether oscil- 

lations in the coupling b(L) may induce oscillations in the GC 
sampling structure. 

Under Approximation II, i.e., large r and k, we can obtain 
the following asymptotic expression through Equation (16) (see 
Appendix A in Supplementary Material for details) 



■xy\ 

, which faithfully reflects the oscillation frequency 



In Figure 4B, the peak of \Sxyico 
at/=/ 

(/o = ^ = j^) of fc(L). In Figure 4E, the oscillation frequency 

/g of Fy%x is approximately g, which is twice of /q. In addi- 
tion, the curve obtained by using Approximation I [Equation 
(5) in Supplementary Material] overlaps with the numerically 
computed curve, and for large k, the curve obtained by using 
Approximation II, Equation (19), fits well with the numeri- 
cally computed curve. Therefore, from Equation (19), we can 
conclude that oscillations at frequency fo in the coupling b{L), 
manifested as the peak frequency / = /q = ^ in I S^y I , implies 



double frequency oscillations (/q : 
structure. 



2/o = ^ ) in the GC sampling 



Case 1.3 By the study of Case 1.2, we can further ask how the 
phase of oscillations in b{L) regulates the oscillations in the GC 
sampling structure. We consider oscillations with phase </> in b{L), 
that is, h{L) = e^^'i^ cos(Pj + <p)V . Under Approximation 

II, we obtain following asymptotic form through Equation (16) 
(see Appendix A in Supplementary Material for details) 



c 



g-2rrffCcos2(^fc-F0). 



(20) 



^ ~ c ( 



e 4rrffc + g-2 



^■i^cos^pk^ 



^9) For a special case, when </> = — Equation (20) becomes 



from which we can observe that there are oscillations in the GC 
sampling structure. 



dk) 



c 



-2rjk-„2 



sin^jSfc. 



(21) 
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FIGURE 4 I Contrast between spectra and GC sampling structures for 
the first class. Magnitude of S^y {\Sxy\) vs. frequency f {f = ^) (black) for 
(A) Case 1.1. (B) Case 1.2. (C) Case 1.3. (D) Comparison of fJ.'^U „ obtained 
through numerical solution Equation (15) (blue), asymptotic expressions 
Equation (2) in Supplementary Material (red plus) and Equation (18) (green 
cross) vs. sampling interval k for Case 1.1. (E) Comparison of fJ,*^ ^ obtained 
through numerical solution Equation (15) (blue), Equation (5) in 



Supplementary Material (red plus) and Equation (19) (green cross) vs. 
sampling interval k for Case 1.2. (F) Comparison of F^'^L x obtained through 
numerical solution Equation (15) (blue). Equation (9) in Supplementary 
Material (red plus) and Equation (21 ) (green cross) vs. sampling interval k for 
Case 1.3. Insets are corresponding log-linear plots of GC vs. sampling interval 
length k. The exponential decay is clearly seen in the insets. The parameters 
are C=10*, = 0.05, and /J = J. 
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In Figure 4C, which depicts the spectrum of Sxy, the peak fre- 
quency of I Sxy I is the same as the peak frequency / = in 
Figure 4B, and this reflects the same frequency of the oscillations 
in the coupling term b{L). In Figure 4F, the oscillation frequency 

/q of Fy%x is again twice of the peak frequency of \Sxy\ [as can 
be seen in Equation (21)]. In addition, the GC sampling structure 
via the approximation of Equation (9) in Supplementary Material 
is in excellent agreement with numerically solved GC sampling 
structure for all Fs. For large k, the asymptotic expression of 
Equation (21) fits well with numerically solved GC sampling 
structure. However, in contrast to Case 1.2, there is a shift in k 
for the points at which Fy%x attains minimum (or maximum). 
Therefore, from Equation (20), we can conclude that the phase 
factor (p in HL) does not affect the frequency of Fy^^ but it 
regulates the phase of the oscillations, i.e., it changes the loca- 
tions of the extrema of Fy%x depending on (p. We also note that, 
in the present case, the minimum value of -Fy^x in oscillations 
approaches 0. 

Case 2.1 and Case 2.2 For the first class we studied above, we 
have («) = C and sj^' (ai) = 1, which do not contribute to the 

fit) 

variations of -Fy-4.x- Then, we obtain a direct relation between the 
cross-spectrum S^yiui) and the GC sampling structure. However, 
because Sxx(w) often has peaks at some frequencies especially 
for time series in neural systems, we further examine the role of 
Sxx{(^) [or Syy(u))] in the oscillations of the GC sampling struc- 
ture. Therefore, we turn to the study of the second class. In order 
to examine how the oscillation of the causality depends on the 
oscillations of a{L) and b{L), we examine the large /x limit, and 
thus, the location of frequency in the peak of Sxx((>^) is determined 
by a(L), whereas the peak of Sxy(w) is determined by h(L). 



First, for a(L) = E/~o e-''*^cos(^i;)L-', b{L) = J2t=i ^''""Li, 
from which there are no oscillations in the coupling b{L), we 
study the behavior of Fy^^ to look for the relation between 
the oscillations in Fy'-ix and oscillations in a(L). In Figure 5A, 
which displays spectra of S^x and \Sxy\, the peak of Sxx is located 
approximately at / = | , which reflects the oscillation frequency 
in a{L) and \Sxy\ concentrates near 0, which reflects the non- 
oscillatory nature of b{L). In Figure 5D, which displays the cor- 
responding GC sampling structure, the oscillation frequency /o 
in the GC sampling structure is approximately i, which equals 
to the peak frequency of Sxx as well as the oscillation frequency 
in a(L). In this case, we conclude that Sxx with a peak at fo 
can induce oscillations with frequency fy in the GC sampling 
structure. 

Second, for the case of a{L) = X!;-t^ e^^'^-'cos(/Jij)L', and 

b{L) = X!jt?i e^^''-'cos(/^2i)i') in which both have oscillations 
with frequency jSi and jii, respectively, we investigate the inter- 
actions of different oscillations in a{L) and b{L) in inducing the 
oscillations in Fy%x- From the studies above, one may suspect that 
the oscillations in Fy%x should be a combination of frequency 
^ and ^ , implying that we may encounter complicated oscilla- 
tion structures for Fy%x with frequency consisting of ^ and ^ . 
Shown in Figures 5E,F are such cases. In Figure 5B, the peak fre- 
quencies of Sxx and \Sxy\ meet the oscillation frequency of a{L) 
(|^ = Yj) and b{L) (^ = |), respectively. The GC sampling 

(k) 

Structure is shown in Figure 5E, where -Fi.y oscillates at frequency 
i which is twice of ^ and is three times of ^ . In Figure 5C, the 
peak frequencies of Sxx and \Sxy\ meet the oscillation frequency 
of a(L) ij^ = 1) and biL) {j^ = jg), respectively The GC 
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FIGURE 5 I Contrast between spectra and the GC sampling structure for 
the second class. The top row of the panel is spectra Sxx (cyan) and \Sxy\ 
(black), which is three orders of magnitude smaller than that of Sxx and is 
magnified by a factor 10'^, vs. frequency f, (A) Case 2.1 with parameters 



■ 4 . = 

0.05, 11 
0.02, M 



0.02, M = 200. (B) Case 2.2 with parameters /i, = g, /j2 : 
= 200. (C) Case 2.2 with parameters /3, = 5, ^2 = 
= 200. (D-F) the corresponding log-linear plots of fJ.'^L x 



obtained numerically (blue) with respect to sampling interval length k. 
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sampling structure is displayed in Figure 5F, in which the oscilla- 
tion features are complicated, making it difficult to discern simple 
frequencies. Therefore, the results in Figure 5 confirm that both 
oscillations of a{L) and b(L) can contribute to the oscillations in 
the GC sampling structure whose oscillation frequency can be a 
combination of those in a(L) and b(L). 

3.2.3. Idealized model II 

In Model I, there is no oscillation in the white noise time series 
Yf. For Case 1, the oscillations in the GC sampling structure are 
related to the peak in Sxy, whereas, for Case 2, the GC oscillations 
are related to the interplay between peaks in Sxy and Sxx- However, 
in the reconstruction of the topology of the simple two-neuron 
network, as shown in Figure 2C, there are peaks at approximately 
the same frequency in Sxx, \Sxy\-> Syy, and this frequency is half 
of the oscillation frequency in the GC sampling structure. We 
consider the following linear model that possesses the same prop- 
erties as those in the I&F network case, i.e., Sx. 
possess a peak at the same frequency: 



■Jxy\ 



, and Syy all 



Xt 
Yt 



a{L)€, + b(L)r]t, 
c{L)rit, 



(22) 



A 150 



B150p 




FIGURE 6 I Contrast between spectra and the GC sampling structure 
obtained numerically for Model II. Sxx (cyan), \Sxy\ (black), Syy (red) witli 
parameters = 0.05, ^ = |, ;/ = 0.4, (A)0 = |, (B)0 = ^. Fx^ y (red), 
Fy^x (cyan), Fx,y (red dash), Fx.y (cyan dash) with respect to sampling 
interval length k with the parameters (C) same as the parameters in (A). (D) 
same as the parameters in (B). 



where the covariance matrix £ 
6(, rjt IS E 



Syy = C((W)C*(tt)), 

cp)u, c{L) = j:+ 



0 1_ 
and 

:0' 



for the white noise time series 
Then, Sxx = aia))a*iw) + b{co)b*(a)). 



■Jxy 



b(co)c*(cL>). In this case, we set 
cos{fij)V, b{L) = y e-'dJcosiPj + 

^''■1 cos{pj)U , where r^, fi, y, 0 are con- 



stants. Note that this model is different from Idealized Model I in 



(k) 



^F^andF^y^ 



x-y 



0. 



that we no longer have F', 

From the spectrum plots shown in Figures 6A,B, it can be 
seen that the peak frequencies of all spectra are approximately 
j2 ■ Figures 6C,D display the corresponding GC sampling struc- 
ture. We can observe that the oscillation frequencies in the GC 
sampling structure are approximately g, which is again twice 
of the peak frequency in Figures 6A,B. Therefore, for both the 
linear model (22) and the I&F networks, there is the same rela- 
tion between the frequency in the GC sampling structure and 
that of the peaks in the spectra as those shown in Figure 2. 
We note that the phase difference 0 between b(L) and a(L) [or 
c{L)] may lead to a rather different behavior of the instantaneous 
Granger causality, Fx.y, as demonstrated in the two cases shown 
in Figures 6C,D (0 = j in Figure 6C and cp = ^ in Figure 6D). 
Note that, by construction, there is no causal influence from Xt to 
Yt in Equation (22). However, Fx^y in Figures 6C,D show signifi- 
cantly non-zero values for ranges of k. Clearly, these GC values are 
spurious and potentially give rise to erroneous causal inferences 
even in this linear setting. As pointed out previously, this spurious 
GC phenomenon also occurs in the I&F neuronal networks (see 
Figure 2A). 

In summary, from the results above, we see that the oscillations 
in the GC sampling structure are related to the oscillations in the 
original time series and in the coupling of the time series regard- 
less of their linear or non-linear nature. For special cases, there is 
a doubling frequency in the GC oscillations in comparison with 
the peak frequency in spectra. However, in general, there may 



not be a simple relation between these oscillation frequencies. 
Nevertheless, we can infer that, in general, all oscillations man- 
ifested as peaks of Sxx, \ Sxy I and Syy may give rise to oscillations 
in the GC sampling structure. 

3.3. VANISHING GC AS 

We have already observed in Figure 3 that the GC value vanishes 
as the sampling interval length r tends to 0 for GC obtained from 
both voltage and spike train time series. Moreover, we note that 
Figure 3 also exhibits that the GC value approaches 0 in a man- 
ner almost linearly proportional to the sampling interval length 
T as T — ^ 0. As mentioned before, this sampling effect gives rise 
to the paradox that GC analysis seems to be less reliable as more 
information is incorporated as r — ^ 0. In the following, we will 
study the mechanisms of such effect and address the question of 
extracting reliable causal interaction as r — 0 in order to resolve 
this paradox. 

Because GC can be analyzed through the spectra, we first 
study the limit of the spectral matrix S'^'(ct)), which is the spec- 
trum of the time series with sampling interval length r. Suppose 
that we perform a uniform sampling of a bivariate continuous- 
time stationary process Xt and Yt with sampling interval length 
r to obtain X„r , Y„j , where n is an integer. Using the Wiener- 
Khinchin theorem, through the relation between the time corre- 
lation of discrete time series and that of the original continuous 
signal, we can obtain the relation between the corresponding 
spectral matrix S^'^Hco) and the power spectral density P(/) (w = 
Ijirf) of the continuous-time processes Xt and Yt as follows (see 
Appendix B in Supplementary Material for details): 



S^x^ico) 



Pxxif) Pxyif) 
Pyxif) Pyyif) 



(23) 



as T — ^ 0. Therefore, S'^''(tti) is scaled as as r 
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Using Equation (12), total Granger causality F^^J with sam- 
pling interval length r can be directly computed using each 
component of spectral matrix S''^' (w), which is 



2Jt J-„ \ s'^ho^)S%\w) I 



(24) 



Taking the limit of r — ^ 0 and using Equation (23), we obtain 



/+0O 
ln[l-C(/)]d/ (25) 
-OO 



is the coherence of 



as r^O, where C(/) = 

continuous-time processes Xf, Yf. 

Because a spectral matrix possesses the property of factoriza- 
tion (Wilson, 1972), i.e., S{co) = A(e'")A*(e'"), where * denotes 
matrix adjoint, the factorization of the spectral matrix gives rise 
to the decomposition S('^'('w) = H'^^Kco)T.'-^^1i'-^''(co)* of spectral 
matrix S'^'(w), where H(^)(<m) = A<"'(e"'')A<'■'(0)"^ and 



the covariance matrix S'^^ = A('^)(0)A('^)(0)*, H : 



(see Appendix B in Supplementary Material 



E2 T2 

.T2 r2. 

for details). Defining H(/) = limr^ o TH''^>(27rr/), 
E = limj.^0 7^;'^', we obtain the limit expressions for 
pI^X y, Fy^X X and F^^y , which are 



'-y^x 



In- 



Pxxif) 



df. 



In- 



Pyy(f) 



Hyy{f)r2H*{f) 



Pxxif)Pyy{f) 



[fl^{f)i2H*,{f)\ [flyy{f)r2H;^{f)\ 



(26) 

(27) 
df (28) 



Then, if /j'^ln[l — C(f)\ df is finite, we can easily show 
and ]:Fx^y all approach finite val- 



that ip':', 

ues in the limit of r ^ 0. Therefore, the Granger causality is 
linearly proportional to the sampling interval length t for small 
T. Hence, vanishing Px^^y, P^^Xy, Py^Xx and as r ^0. The 
corresponding limits are related to the intrinsic properties of the 
continuous-time processes. 



4. DISCUSSION AND CONCLUSION 

4.1. SELECTION OF SAMPLING INTERVALS IN THE APPLICATION OF GC 

From our results above, it becomes evident that one has to be 
careful in interpreting causal inference using the GC analysis. 
For sufficiently large sampling interval length, the GC in general 
decays exponentially to zero because much of information over 
dominant frequencies is lost and the GC value becomes small and 
unreliable for causal inference. When there is causal flow, to avoid 
a possible vanishing GC value due to zeroes in the oscillations 
in the GC structure, one should use a range of sampling interval 



lengths to obtain discrete time series to ascertain the general fea- 
tures of the GC sampling structure, such as Figure 2, so as to avoid 
using accidental vanishing GC values for causal inference and 
to obtain a reliable interpretation of causality. When there is no 
causal flow, spurious causality may also arise (e.g.. Figures 2, 6) 
when the sampling interval length is large because the loss of high 
frequency information for self prediction of one time series can 
possibly be compensated by the other time series. 

In order to preserve the high frequency information, one 
would use very fine sampling intervals to sample a continuous- 
time process. However, as we have pointed out above that the 
GC values exhibit a linear relation with r as the sampling interval 
vanishes, thus, again leading to seemingly unreliable inference of 
causality. We can circumvent this difficulty by using the following 
procedure. 

First, for a range of small r's, we ascertain the linear range of 
the GC value as a function of r, then we plot the ratio of the GC 
value to the corresponding sampling interval length r to extract 
its limiting value as r — ^ 0, e.g., by Equation (25), Figure 7 shows 
such an example for time series sampled from the I&F network 
dynamics. It is clear that the directed couplings of the neuronal 
network dynamics are correctly inferred by the clearly non-zero 
ratio for the presence of the synaptic coupling and a vanishing 
ratio for the absence of synaptic coupling. According to asymp- 
totic distributions of GCs, the GC computed from time series 
is a biased estimate, e.g., the bias is ^ for Fx^y, where p is 
the regression order, n is the length of time series. To plot the 
ratio of GC to r, we used the numerical computed GC with 
the bias subtracted (see Appendix C in Supplementary Material 
for details). 

In applications, we may wonder how we determine this lin- 
ear range and whether this linear range is sufficiently large for 

the GC extraction. Here, we present an estimation using Equation 

1 

(25). The idea is that if f '\ In [l — C(/)] df is a good approxi- 

mation of fX^ In [l — C(/)] df for some tq, which implies that 
the causal information with frequency higher than can be 
ignored, then, tq is inside the linear range for small GCs. We can 




0.5 1 
-u (ms) 



FIGURE 7 I (A) Coherence vs. frequency f. (b) If^'U y (red), 1fJ,'U ^ 
(cyan), 1 F^Jy (red dash), 1 Fi'' y (cyan dash) for voltage time series and 
-Fx^y computed through Equation (25) (black horizontal line) vs. sampling 
interval length z. The time series are generated by the I&F network whose 
topology is shown in Figure 1A with parameters v = 1 ms"^ , s = 0.02, 
X = 0.0177. Note that we have subtracted the estimation bias of GC from 
the estimate of GC values for (B). The procedure of removing biases is 
described in Appendix C in Supplementary Material. 
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determine such range by inspecting directly the figure of C(f ) and 
choose a proper cut-off frequency /o such that C(/) is sufficiently 
small, thus can be ignored when [/[ > fo- We take the voltage time 
series of a two-neuron I&F network as an example. Figure 7 A 
displays the coherence function C(/). It can be seen from the fig- 
ure that we can choose the cut-off frequency /o = 500 Hz, above 
which the coherence almost vanishes. Then, we can conclude that 
the linear range for GC is from 0 ms to ~1 ms. Figure 7B demon- 
strates the validity of our estimate where ^Fx^,y is approximately a 
constant over the range from 0 ms to ~ 1 ms of the sampling inter- 
val. Therefore, it is sufficient to use voltage time series sampled 
with time interval 0.5 ms or less to extract good approximations 
of the limiting GC-to-r ratios. We note in passing that, in the 
Method section, we used r = 0.5 ms to evaluate the GC values 
for the network reconstruction. 

4.2. CONCLUSIONS 

In this work, we have discussed general features of the GC 
sampling structure and their strong implications for causal 
inference by using the GC analysis. We have also discussed a 
general approach that can overcome these artifacts arising from 
sampling interval lengths to obtain reliable GC values. We note 
that these issues arise when we sample continuous-time pro- 
cesses to obtain discrete time series for the GC analysis. Therefore, 
the issues should be examined regardless of whether one uses 
parametric or non-parametric methods for the GC estimation. 
Furthermore, the general strategies of overcoming these sampling 
issues are not limited to the bivariate time series with unidi- 
rectional connection as discussed in this work. They are also 
applicable to the GC analysis of bivariate time series with bidi- 
rectional connections (shown in Supplementary Figure 2) as well 
as multivariate time series with any general connectivity structure 
(Zhou etal, 2013b, 2014). 
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